-
Notifications
You must be signed in to change notification settings - Fork 1.1k
fix: improve background task completion detection and message extraction #638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: improve background task completion detection and message extraction #638
Conversation
- Fix TS2742 by adding explicit ToolDefinition type annotations - Add stability detection (3 consecutive stable polls after 10s minimum) - Remove early continue when sessionStatus is undefined - Add silent notification system via tool.execute.after hook injection - Change task retention from 200ms to 5 minutes for background_output retrieval - Fix formatTaskResult to sort messages by time descending Fixes hanging background tasks that never complete due to missing sessionStatus.
- Add stability-based completion detection (10s min + 3 stable polls) - Fix message extraction to recognize 'reasoning' parts from thinking models - Switch from promptAsync() to prompt() for proper agent initialization - Remove model parameter from prompt body (use agent's configured model) - Add fire-and-forget prompt pattern for sisyphus_task sync mode - Add silent notification via tool.execute.after hook injection - Fix indentation issues in manager.ts and index.ts Incorporates fixes from: - PR code-yeongyu#592: Stability detection mechanism - PR code-yeongyu#610: Model parameter passing (partially) - PR code-yeongyu#628: Completion detection improvements Known limitation: Thinking models (e.g. claude-*-thinking-*) cause JSON Parse errors in child sessions. Use non-thinking models for background agents until OpenCode core resolves this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 50693f3e99
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| (p.type === "text" && p.text && p.text.trim().length > 0) || | ||
| // Reasoning content (thinking blocks) | ||
| (p.type === "reasoning" && p.text && p.text.trim().length > 0) || | ||
| // Tool calls (indicates work was done) | ||
| p.type === "tool" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Treat tool_result as output in completion check
The new completion gate only treats text, reasoning, or tool parts as “content”. In OpenCode sessions, tool outputs are emitted as tool_result parts (see how other code paths already parse tool_result), so background tasks that only run tools (e.g., grep/glob/bash) will keep hasContent false and will never pass the idle/stability checks, eventually timing out after 30 minutes despite having results. This check should consider tool_result (and its content) as valid output to avoid false “no output” loops.
Useful? React with 👍 / 👎.
| const allComplete = !pendingSet || pendingSet.size === 0 | ||
| const remainingCount = pendingSet?.size ?? 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Count resumed tasks in batched completion logic
The “all tasks complete” decision now hinges on pendingByParent, but that set is only populated in launch(); resumed or externally-registered tasks are never added. As a result, if a user resumes multiple background sessions (or mixes resumed with launched tasks), allComplete can become true while other resumed tasks are still running, causing the hook to announce “[ALL BACKGROUND TASKS COMPLETE]” prematurely and breaking the “silent until all complete” contract. Consider adding resumed/external task IDs to pendingByParent so batching reflects all active tasks.
Useful? React with 👍 / 👎.
…/external tasks Addresses code review feedback from PR code-yeongyu#638: P1: Add tool_result type to validateSessionHasOutput() to prevent false negatives for tool-only background tasks that would otherwise timeout after 30 minutes despite having valid results. P2: Add pendingByParent tracking to resume() and registerExternalTask() to prevent premature 'ALL COMPLETE' notifications when mixing launched and resumed tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
7 issues found across 11 files
Confidence score: 2/5
- Sorting in
src/tools/call-omo-agent/tools.tsusesString(a.info?.time)even thoughtimeholds nestedcreatednumbers, so ordering breaks the moment those objects appear—this is a concrete regression risk for tool selection. src/features/background-agent/manager.tsomitstool_resultin output validation, meaning successful tool-heavy background runs will be flagged invalid and short-circuited, a direct user-facing failure.- The same manager ignores caller-provided
modelsettings and fails to track resumed tasks inpendingByParent, so category-specific configs are lost and batch completion signals fire prematurely—both issues affect correctness rather than mere logging. - Pay close attention to
src/features/background-agent/manager.ts,src/tools/call-omo-agent/tools.ts- multiple functional regressions need resolution.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/features/background-agent/manager.ts">
<violation number="1" location="src/features/background-agent/manager.ts:137">
P2: Misleading log message: says "promptAsync" but code actually calls `prompt()`. Update to reflect the actual method being called.</violation>
<violation number="2" location="src/features/background-agent/manager.ts:145">
P1: BackgroundManager ignores the `model` provided by callers (e.g., Sisyphus category configs), so category-specific model settings are silently dropped and all delegated sessions run with the agent’s default model.</violation>
<violation number="3" location="src/features/background-agent/manager.ts:291">
P2: Misleading log message: says "promptAsync" but code actually calls `prompt()`. Update to reflect the actual method being called.</violation>
<violation number="4" location="src/features/background-agent/manager.ts:291">
P2: Resumed tasks are not tracked in `pendingByParent`, causing premature batch completion notifications. When a task is resumed, it should be added to the pending set just like launched tasks. Without this, calling `resume()` and then having other launched tasks complete will trigger "[ALL BACKGROUND TASKS COMPLETE]" while the resumed task is still running.</violation>
<violation number="5" location="src/features/background-agent/manager.ts:480">
P1: Missing `tool_result` part type in output validation. Background tasks that primarily run tools (grep, glob, bash) emit results as `tool_result` parts, but this check only looks for `text`, `reasoning`, and `tool`. This will cause tool-heavy tasks to never pass the output validation, eventually timing out after 30 minutes despite having valid results.</violation>
</file>
<file name="src/tools/call-omo-agent/tools.ts">
<violation number="1" location="src/tools/call-omo-agent/tools.ts:191">
P1: Sorting by `String(a.info?.time)` is incorrect—`time` is an object `{ created?: number }`, so this converts to `"[object Object]"` and breaks sorting. Use `a.info?.time?.created` instead, consistent with similar code in `sisyphus-task/tools.ts` and `look-at/tools.ts`.</violation>
</file>
<file name="src/tools/sisyphus-task/tools.ts">
<violation number="1" location="src/tools/sisyphus-task/tools.ts:440">
P2: Race condition: the 100ms delay is insufficient to catch asynchronous API errors. If `client.session.prompt()` rejects after this delay, `promptError` is set but never checked again—the polling loop continues for up to 10 minutes before returning a misleading error. Consider checking `promptError` periodically within the polling loop.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| // Reasoning content (thinking blocks) | ||
| (p.type === "reasoning" && p.text && p.text.trim().length > 0) || | ||
| // Tool calls (indicates work was done) | ||
| p.type === "tool" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P1: Missing tool_result part type in output validation. Background tasks that primarily run tools (grep, glob, bash) emit results as tool_result parts, but this check only looks for text, reasoning, and tool. This will cause tool-heavy tasks to never pass the output validation, eventually timing out after 30 minutes despite having valid results.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/background-agent/manager.ts, line 480:
<comment>Missing `tool_result` part type in output validation. Background tasks that primarily run tools (grep, glob, bash) emit results as `tool_result` parts, but this check only looks for `text`, `reasoning`, and `tool`. This will cause tool-heavy tasks to never pass the output validation, eventually timing out after 30 minutes despite having valid results.</comment>
<file context>
@@ -384,6 +438,62 @@ export class BackgroundManager {
+ // Reasoning content (thinking blocks)
+ (p.type === "reasoning" && p.text && p.text.trim().length > 0) ||
+ // Tool calls (indicates work was done)
+ p.type === "tool"
+ )
+ })
</file context>
| p.type === "tool" | |
| p.type === "tool" || | |
| // Tool results contain actual output from tool calls | |
| p.type === "tool_result" |
| log("[background-agent] Resuming task:", { taskId: existingTask.id, sessionID: existingTask.sessionID }) | ||
|
|
||
| this.client.session.promptAsync({ | ||
| log("[background-agent] Resuming task - calling promptAsync with:", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Resumed tasks are not tracked in pendingByParent, causing premature batch completion notifications. When a task is resumed, it should be added to the pending set just like launched tasks. Without this, calling resume() and then having other launched tasks complete will trigger "[ALL BACKGROUND TASKS COMPLETE]" while the resumed task is still running.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/background-agent/manager.ts, line 291:
<comment>Resumed tasks are not tracked in `pendingByParent`, causing premature batch completion notifications. When a task is resumed, it should be added to the pending set just like launched tasks. Without this, calling `resume()` and then having other launched tasks complete will trigger "[ALL BACKGROUND TASKS COMPLETE]" while the resumed task is still running.</comment>
<file context>
@@ -261,7 +288,15 @@ export class BackgroundManager {
log("[background-agent] Resuming task:", { taskId: existingTask.id, sessionID: existingTask.sessionID })
- this.client.session.promptAsync({
+ log("[background-agent] Resuming task - calling promptAsync with:", {
+ sessionID: existingTask.sessionID,
+ agent: existingTask.agent,
</file context>
| log("[background-agent] Resuming task - calling promptAsync with:", { | |
| // Track resumed task for batched notifications | |
| const pending = this.pendingByParent.get(existingTask.parentSessionID) ?? new Set() | |
| pending.add(existingTask.id) | |
| this.pendingByParent.set(existingTask.parentSessionID, pending) | |
| log("[background-agent] Resuming task - calling promptAsync with:", { |
…sorting, race condition - Fix misleading log messages: 'promptAsync' -> 'prompt (fire-and-forget)' - Restore model passthrough in launch() for Sisyphus category configs - Fix call-omo-agent sorting: use time.created number instead of String(time) - Fix race condition: check promptError inside polling loop, not just after 100ms
…ing and sync mode ## Summary This commit represents the culmination of extensive debugging and refinement of the background agent and sisyphus_task systems, building upon changes from PRs code-yeongyu#592, code-yeongyu#610, code-yeongyu#628, code-yeongyu#638, code-yeongyu#648, code-yeongyu#649, code-yeongyu#652, and code-yeongyu#653. ## Investigation Journey ### Initial Problem Background tasks were getting stuck indefinitely. User config model overrides were being ignored for certain agent types. ### Root Cause Analysis 1. Discovered validateSessionHasOutput and checkSessionTodos guards were blocking completion even when tasks had finished 2. Found that sync mode (run_in_background=false) was NOT passing categoryModel to session.prompt(), while async mode was 3. Traced config loading path and found JSONC files weren't being detected 4. Analyzed kdcokenny/opencode-background-agents for reference implementation ### Trial and Error Log **Attempt 1: Add model to resume() in manager.ts** - Hypothesis: Resume needs to pass stored model - Result: REVERTED - PR code-yeongyu#638 intentionally removed this; agent config handles it **Attempt 2: Add userAgents lookup for subagent_type** - Hypothesis: Need to look up agent model from user config - Result: REVERTED - Agent model already applied at creation in config-handler.ts **Attempt 3: Add categoryModel to sync mode prompt** - Hypothesis: Sync mode missing model that async mode passes - Result: SUCCESS - This was the actual bug **Attempt 4: Add debug logging throughout pipeline** - Purpose: Trace model flow through config -> agent -> prompt - Files: 6 files with appendFileSync to omo-debug.log - Result: Confirmed fixes working, then REMOVED all debug logging **Attempt 5: Investigate clickable sessions** - Hypothesis: parentID should make child sessions clickable in UI - Result: parentID IS passed correctly, but sessions don't appear clickable - Analysis: kdcokenny uses same approach; may be OpenCode core limitation - Status: UNRESOLVED - Needs further investigation or OpenCode core change ## Background Agent Completion Detection (PR code-yeongyu#638) Simplified the completion detection logic that was causing tasks to get stuck: - Removed overly complex validateSessionHasOutput and checkSessionTodos guards - Tasks now complete after MIN_IDLE_TIME_MS (5s) elapsed on session.idle event - Added 15-minute global timeout (MAX_RUN_TIME_MS) to prevent runaway tasks - Pattern follows kdcokenny/opencode-background-agents reference implementation ## Model Override Architecture (PRs code-yeongyu#610, code-yeongyu#628, code-yeongyu#638) Established clear separation between category-based and agent-based model handling: | Path | Model Source | |-------------------|-------------------------------------------| | category=X | Explicit from category config (passed) | | subagent_type=X | Agent's configured model (at creation) | | resume | Agent's configured model (not passed) | Key insight from PR code-yeongyu#638: Don't pass model in prompt body for resume/subagent - let OpenCode use the agent's configured model set at creation time in config-handler.ts. ## Sync Mode Category Model Fix (NEW) Fixed critical bug where sync mode (run_in_background=false) with categories was NOT passing the categoryModel to session.prompt(): // BEFORE: Model not passed in sync mode body: { agent: agentToUse, system: systemContent, ... } // AFTER: Model passed when available body: { agent: agentToUse, ...(categoryModel ? { model: categoryModel } : {}), ... } This ensures category model overrides work consistently in both sync and async modes. ## JSONC Config File Support Extended config file detection to support both .json and .jsonc extensions: - getUserConfigDir() now checks for oh-my-opencode.jsonc first - Both cross-platform (~/.config) and Windows (%APPDATA%) paths support JSONC - Enables comments in config files for better documentation ## Test Improvements - Increased sync resume test timeout from 5s to 10s - Test was flaky because timeout = MIN_STABILITY_TIME_MS (race condition) - Added clarifying comments about timing requirements ## Code Cleanup - Removed unused 'os' imports from plugin-config.ts and config-handler.ts - Removed ALL debug logging (hardcoded paths, appendFileSync calls) - Added PR code-yeongyu#638 reference comments for future maintainers ## Verified Test Results (8/8 category + subagent tests pass) | Test | Type | Mode | Result | |-------------------|-------------|-------|--------| | quick | category | async | ✅ | | ultrabrain | category | async | ✅ | | most-capable | category | async | ✅ | | quick | category | sync | ✅ | | librarian | subagent | async | ✅ | | Metis | subagent | async | ✅ | | oracle | subagent | sync | ✅ | | quick + git-master| category | async | ✅ | ## Known Issues & Future Work ### 1. Explore Agent Hangs on Non-Exploration Tasks The explore agent hung when given a simple math query (5+5). This is NOT a regression - explore is a specialized codebase search agent (contextual grep) designed for queries like 'Where is X implemented?' not general Q&A. When given non-exploration tasks, it attempts to search for non-existent patterns and may hang indefinitely due to no max_steps limit. **Recommendation**: Add max_steps: 10 to explore agent config in future PR. ### 2. Clickable Child Sessions Not Working Sessions created via sisyphus_task pass parentID correctly, but don't appear as clickable child sessions in OpenCode's sidebar UI. Investigation showed: - parentID IS being passed to session.create() - kdcokenny/opencode-background-agents uses identical approach - Sessions exist and complete, just not rendered as clickable in UI **Recommendation**: May require OpenCode core change or UI setting discovery. ### 3. Prometheus Agent Correctly Blocked Prometheus (Planner) is a primary agent and correctly rejected when called via sisyphus_task with subagent_type. This is expected behavior - primary agents should only be invoked directly, not via task delegation. ## Files Changed - src/features/background-agent/manager.ts - PR code-yeongyu#638 reference comment - src/tools/sisyphus-task/tools.ts - Sync mode categoryModel fix - src/tools/sisyphus-task/tools.test.ts - Test timeout increase - src/shared/config-path.ts - JSONC extension support - src/plugin-config.ts - Cleanup unused import - src/plugin-handlers/config-handler.ts - Cleanup unused import
…/external tasks Addresses code review feedback from PR code-yeongyu#638: P1: Add tool_result type to validateSessionHasOutput() to prevent false negatives for tool-only background tasks that would otherwise timeout after 30 minutes despite having valid results. P2: Add pendingByParent tracking to resume() and registerExternalTask() to prevent premature 'ALL COMPLETE' notifications when mixing launched and resumed tasks.
…/external tasks Addresses code review feedback from PR code-yeongyu#638: P1: Add tool_result type to validateSessionHasOutput() to prevent false negatives for tool-only background tasks that would otherwise timeout after 30 minutes despite having valid results. P2: Add pendingByParent tracking to resume() and registerExternalTask() to prevent premature 'ALL COMPLETE' notifications when mixing launched and resumed tasks.
…ion (#638) * fix: background task completion detection and silent notifications - Fix TS2742 by adding explicit ToolDefinition type annotations - Add stability detection (3 consecutive stable polls after 10s minimum) - Remove early continue when sessionStatus is undefined - Add silent notification system via tool.execute.after hook injection - Change task retention from 200ms to 5 minutes for background_output retrieval - Fix formatTaskResult to sort messages by time descending Fixes hanging background tasks that never complete due to missing sessionStatus. * fix: improve background task completion detection and message extraction - Add stability-based completion detection (10s min + 3 stable polls) - Fix message extraction to recognize 'reasoning' parts from thinking models - Switch from promptAsync() to prompt() for proper agent initialization - Remove model parameter from prompt body (use agent's configured model) - Add fire-and-forget prompt pattern for sisyphus_task sync mode - Add silent notification via tool.execute.after hook injection - Fix indentation issues in manager.ts and index.ts Incorporates fixes from: - PR #592: Stability detection mechanism - PR #610: Model parameter passing (partially) - PR #628: Completion detection improvements Known limitation: Thinking models (e.g. claude-*-thinking-*) cause JSON Parse errors in child sessions. Use non-thinking models for background agents until OpenCode core resolves this. * fix: add tool_result handling and pendingByParent tracking for resume/external tasks Addresses code review feedback from PR #638: P1: Add tool_result type to validateSessionHasOutput() to prevent false negatives for tool-only background tasks that would otherwise timeout after 30 minutes despite having valid results. P2: Add pendingByParent tracking to resume() and registerExternalTask() to prevent premature 'ALL COMPLETE' notifications when mixing launched and resumed tasks. * fix: address code review feedback - log messages, model passthrough, sorting, race condition - Fix misleading log messages: 'promptAsync' -> 'prompt (fire-and-forget)' - Restore model passthrough in launch() for Sisyphus category configs - Fix call-omo-agent sorting: use time.created number instead of String(time) - Fix race condition: check promptError inside polling loop, not just after 100ms
Summary
This PR consolidates multiple fixes for background task completion detection, message extraction, and notification systems, making background agents reliable with non-thinking models.
Changes
1. Stability-Based Completion Detection
lastMsgCountandstablePollstracking onBackgroundTasktypeMIN_STABILITY_TIME_MS) + 3 consecutive polls with unchanged message countsession.idlefires before agent respondsvalidateSessionHasOutput()edge guard to verify actual content exists2. Session Notification System with
noReplysession.prompt({ noReply: true })for silent notificationspendingByParent: Map<string, Set<string>>to track pending tasks per parent sessionnoReply: true(silent injection)noReply: falseto trigger AI responsependingNotificationsMap +tool.execute.afterinjection approach3. Message Extraction Fix for Thinking Models
type: "reasoning"for thinking model output, nottype: "text"type: "tool"for tool call resultsmanager.ts→validateSessionHasOutput()- checks"text","reasoning","tool"background-task/tools.ts→ message extractioncall-omo-agent/tools.ts→ message extractionsisyphus-task/tools.ts→ message extraction4. Prompt Method Fix
promptAsync()toprompt()(fire-and-forget) inBackgroundManagerpromptAsync()just queues without executing;prompt()properly initializes agent loop5. Sisyphus Sync Mode Fix
await prompt()to fire-and-forgetprompt()+ stability pollingawaitonsession.prompt()causes JSON parse errors with thinking models6. Model Parameter Removal
modelparameter from prompt body inlaunch()andresume()7. Edge Guards for Premature Completion
MIN_IDLE_TIME_MS(5s) minimum wait before acceptingsession.idleeventsvalidateSessionHasOutput()checks for actual assistant/tool messages with content8. 5-Minute Task Retention
TASK_RETENTION_MS) forbackground_outputretrievalReferenced PRs
Files Changed
manager.tsvalidateSessionHasOutput(),prompt()instead ofpromptAsync(),noReplynotifications,pendingByParenttrackingbackground-task/tools.ts"reasoning"part extractioncall-omo-agent/tools.ts"reasoning"part extractionsisyphus-task/tools.ts"reasoning"extraction, fire-and-forget prompt, stability pollingbackground-notification/index.tsindex.tsschema.tstypes.tslastMsgCount,stablePollstoBackgroundTaskTest Results
gemini-3-flashgemini-3-flashclaude-opus-4-5-thinking-maxclaude-sonnet-4-5-thinking-lowThinking models cause
JSON Parse error: Unexpected EOFin child sessions.This affects all agents configured with
-thinking-*model variants (e.g.,claude-opus-4-5-thinking-max,claude-sonnet-4-5-thinking-low).Root cause: The streaming response format from thinking models isn't compatible with how
session.prompt()parses responses in child sessions.Workaround: Use non-thinking models for background agents in
oh-my-opencode.jsonc:Key Constants
MIN_STABILITY_TIME_MSMIN_IDLE_TIME_MSSTABILITY_POLLS_REQUIREDTASK_RETENTION_MSPOLL_INTERVAL_MSChecklist
bun run build)bun run typecheck)